A Dataset for Arabic Textual Entailment
Abstract
There are fewer textual entailment (TE) resources for Arabic than for other languages, and the manpower needed to construct such a resource is hard to come by. We describe a semi-automatic technique for creating a first dataset for Arabic TE systems, using an extension of the ‘headline-lead paragraph’ technique. We also sketch the difficulties inherent in relying on judgments from volunteer annotators, and describe a regime to ameliorate some of them.
Similar resources
Bar Ilan University Applied Textual Entailment
This thesis introduces the applied notion of textual entailment as a generic empirical task that captures major semantic inferences across many applications. Textual entailment addresses semantic inference as a direct mapping between language expressions and abstracts the common semantic inferences as needed for text based Natural Language Processing applications. We define the task and describ...
A Lexical Alignment Model for Probabilistic Textual Entailment
This paper describes the Bar-Ilan system participating in the Recognising Textual Entailment Challenge. The paper proposes first a general probabilistic setting that formalizes the notion of textual entailment. We then describe a concrete alignment-based model for lexical entailment, which utilizes web co-occurrence statistics in a bag of words representation. Finally, we report the results of ...
Recognizing Textual Entailment Using a Machine Learning Approach
We present our experiments on Recognizing Textual Entailment based on modeling the entailment relation as a classification problem. As features used to classify the entailment pairs we use a symmetric similarity measure and a non-symmetric similarity measure. Our system achieved an accuracy of 66% on the RTE-3 development dataset (with 10-fold cross validation) and accuracy of 63% on the RTE-3 ...
Natural Language Inference from Multiple Premises
We define a novel textual entailment task that requires inference over multiple premise sentences. We present a new dataset for this task that minimizes trivial lexical inferences, emphasizes knowledge of everyday events, and presents a more challenging setting for textual entailment. We evaluate several strong neural baselines and analyze how the multiple premise task differs from standard tex...
The Second PASCAL Recognising Textual Entailment Challenge
This paper describes the Second PASCAL Recognising Textual Entailment Challenge (RTE-2). We describe the RTE-2 dataset and overview the submissions for the challenge. One of the main goals for this year’s dataset was to provide more “realistic” text-hypothesis examples, based mostly on outputs of actual systems. The 23 submissions for the challenge present diverse approaches and research direct...